with shadow]] Ray casting is the methodological basis for 3D CAD/CAM solid modeling and image rendering. It is essentially the same as ray tracing for computer graphics where virtual light rays are "cast" or "traced" on their path from the focal point of a camera through each pixel in the camera sensor to determine what is visible along the ray in the 3D scene.
The term "Ray Casting" was introduced by Scott Roth while at the General Motors Research Labs from 1978–1980. His paper, "Ray Casting for Modeling Solids", describes modeled solid objects by combining primitive solids, such as blocks and cylinders, using the set operators union (+), intersection (&), and difference (−). The general idea of using these binary operators for solid modeling is largely due to Voelcker and Requicha's geometric modelling group at the University of Rochester.
Before ray casting (and ray tracing), computer graphics algorithms projected surfaces or edges (e.g., lines) from the 3D world to the image plane where visibility logic had to be applied. The world-to-image plane projection is a 3D homogeneous coordinate system transformation, also known as 3D projection, affine transformation, or projective transform (homography). Rendering an image this way is difficult to achieve with hidden surface/edge removal. Plus, silhouettes of curved surfaces have to be explicitly solved for whereas it is an implicit by-product of ray casting, so there is no need to explicitly solve for it whenever the view changes.
Ray casting greatly simplified image rendering of 3D objects and scenes because a line transforms to a line. So, instead of projecting curved edges and surfaces in the 3D scene to the 2D image plane, transformed lines (rays) are intersected with the objects in the scene. A homogeneous coordinate transformation is represented by a 4×4 matrix. The mathematical technique is common to computer graphics and geometric modeling. . A transform includes rotations around the three axes, independent scaling along the axes, translations in 3D, and even skewing. Transforms are easily concatenated via matrix arithmetic. For use with a 4×4 matrix, a point is represented by X,, and a direction vector is represented by Dx,. (The fourth term is for translation, which does not apply to direction vectors.)
The idea behind ray casting is to trace rays from the eye, one per pixel, and find the closest object blocking the path of that ray—think of an image as a screen-door, with each square in the screen being a pixel. This is then the object the eye sees through that pixel. Using the material properties and the effect of the lights in the scene, this algorithm can determine the shading of this object. The simplifying assumption is made that if a surface faces a light, the light will reach that surface and not be blocked or in shadow. The shading of the surface is computed using traditional 3D computer graphics shading models. One important advantage ray casting offered over older scanline algorithms was its ability to easily deal with non-planar surfaces and solids, such as cones and . If a mathematical surface can be intersected by a ray, it can be rendered using ray casting. Elaborate objects can be created by using solid modelling techniques and easily rendered.
From the abstract for the paper "Ray Casting for Modeling Solids":
To visualize and analyze the composite solids modeled, virtual light rays are cast as probes. By virtue of its simplicity, ray casting is reliable and extensible. The most difficult mathematical problem is finding line-surface intersection points. So, surfaces as planes, quadrics, Torus, and probably even parametric surface patches may bound the primitive solids. The adequacy and efficiency of ray casting are issues addressed here. A fast picture generation capability for interactive modeling is the biggest challenge.
Light rays and the camera geometry form the basis for all geometric reasoning here. This figure shows a pinhole camera model for perspective effect in image processing and a parallel camera model for mass analysis. The simple pinhole camera model consists of a focal point (or eye point) and a square pixel array (or screen). Straight light rays pass through the pixel array to connect the focal point with the scene, one ray per pixel. To shade pictures, the rays’ intensities are measured and stored as pixels. The reflecting surface responsible for a pixel’s value intersects the pixel’s ray.
When the focal length, distance between focal point and screen, is infinite, then the view is called “parallel” because all light rays are parallel to each other, perpendicular to the screen. Although the perspective view is natural for making pictures, some applications need rays that can be uniformly distributed in space.
A ray is simply a straight line in the 3D space of the camera model. It is best defined in parameterized form as a point vector (X0, Y0, Z0) and a direction vector (Dx, Dy, Dz). In this form, points on the line are ordered and accessed via a single parameter t. For every value of t, a corresponding point (X, Y, Z) on the line is defined:
X = X0 + t · Dx
Y = Y0 + t · Dy
Z = Z0 + t · Dz
If the vector is normalized, then the parameter t is distance along the line. The vector can be normalized easily with the following computation:
Dist = √(Dx2 + Dy2 + Dz2)
D'x = Dx / Dist
D'y = Dy / Dist
D'z = Dz / Dist
Given geometric definitions of the objects, each bounded by one or more surfaces, the result of computing one ray’s intersection with all bounded surfaces in the screen is defined by two arrays:
'''Ray parameters:''' ''t''[1], ''t''[2], …, ''t''[n]
'''Surface pointers:''' S[1], S[2], …, S[n]
Where n is the number of ray-surface intersections. The ordered list of ray parameters, ti, denote the enter–exit points. The ray enters a solid at point t1, exits at t2, enters a solid at t3, etc. Point t1 is closest to the camera and tn is furthest.
In association with the ray parameters, the surface pointers contain a unique address for the intersected surface’s information. The surface can have various properties such as color, specularity, transparency with/without refraction, translucency, etc. The solid associated with the surface may have its own physical properties such as density. This could be useful, for instance, when an object consists of an assembly of different materials and the overall center of mass and moments of inertia are of interest.
Roth’s ray casting system generated the images of solid objects on the right. Box enclosures, dynamic bounding, and coherence were used for optimization. For each picture, the screen was sampled with a density of about 100×100 (e.g., 10,000) rays and new edges were located via binary searches. Then all edges were followed by casting additional rays at one pixel increments on the two sides of the edges. Each picture was drawn on a Tektronix tube at 780×780 resolution.
S × S × (''t''[2]-''t''[1] + ''t''[4]-''t''[3] + … + ''t''[n]-''t''[n-1]) / L
where L is defined as the length of the direction vector. (If already normalized, this is equal to 1.)
L = √(Dx2 + Dy2 + Dz2)
Each ( t i- ti-1)/L is a length of a ray segment that is inside of the solid.
This figure shows the parallelepipeds for a modeled solid using ray casting. This is a use of parallel-projection camera model.
The ray casting procedure starts at the top of the solid composition tree, recursively descends to the bottom, classifies the ray with respect to the primitive solids, and then returns up the tree combining the classifications of the left and right subtrees.
This figure illustrates the combining of the left and right classifications for all three binary operators.
This figure shows a table scene with shadows from two point light sources.
Shading algorithms that implement all of the realistic effects are computationally expensive, but relatively simple. For example, the following figure shows the additional rays that could be cast for a single light source.
For a single pixel in the image to be rendered, the algorithm casts a ray starting at the focal point and determines that it intersects a semi-transparent rectangle and a shiny circle. An additional ray must then be cast starting at that point in the direction symmetrically opposite the surface normal at the ray-surface intersection point in order to determine what is visible in the mirrored reflection. That ray intersects the triangle which is opaque. Finally, each ray-surface intersection point is tested to determine if it is in shadow. The “Shadow feeler” ray is cast from the ray-surface intersection point to the light source to determine if any other surface blocks that pathway.
Turner Whitted calls the secondary and additional rays “Recursive Ray Tracing”. A Whitted modeled refraction for transparencies by generating a secondary ray from the visible surface point at an angle determined by the solid’s index of refraction. The secondary ray is then processed as a specular ray. For the refraction formula and pictorial examples, see Whitted’s paper.
By using minimum bounding boxes around the solids in the composition tree, the exhaustive search for a ray-solid intersection resembles an efficient binary search. The brute force algorithm does an exhaustive search because it always visits all the nodes in the tree—transforming the ray into primitives’ local coordinate systems, testing for ray-surface intersections, and combining the classifications—even when the ray clearly misses the solid. In order to detect a “clear miss”, a faster algorithm uses the binary composition tree as a hierarchical representation of the space that the solid composition occupies. But all position, shape, and size information is stored at the leaves of the tree where primitive solids reside. The top and intermediate nodes in the tree only specify combine operators.
Characterizing with enclosures the space that all solids fill gives all nodes in the tree an abstract summary of position and size information. Then, the quick “ray intersects enclosure” tests guide the search in the hierarchy. When the test fails at an intermediate node in the tree, the ray is guaranteed to classify as out of the composite, so recursing down its subtrees to further investigate is unnecessary.
Accurately assessing the cost savings for using enclosures is difficult because it depends on the spatial distribution of the primitives (the complexity distribution) and on the organization of the composition tree. The optimal conditions are:
In contrast, the worst condition is:
The following are miscellaneous performance improvements made in Roth’s paper on ray casting, but there have been considerable improvements subsequently made by others.
To smooth the jagged edges in a shaded picture with subpixel accuracy, additional rays should be cast for information about the edges. (See Supersampling for a general approach.) Edges are formed by the intersection of surfaces or by the profile of a curved surface. Applying "Coherence" as described above via binary search, if the visible surface at pixel (X,Y) is different than the visible surface at pixel (X+1,Y), then a ray could be generated midway between them at (X+½,Y) and the visible surface there identified. The distance between sample points could be further subdivided, but the search need not be deep. The primary search depth to smooth jagged edges is a function of the intensity gradient across the edge. The cost for smoothing jagged edges is affordable, since:
The purpose of the grid based levels was twofold — ray-wall collisions can be found more quickly since the potential hits become more predictable and memory overhead is reduced. However, encoding wide-open areas takes extra space.
|
|